
    On The Hereditary Discrepancy of Homogeneous Arithmetic Progressions

    We show that the hereditary discrepancy of homogeneous arithmetic progressions is lower bounded by $n^{1/O(\log \log n)}$. This bound is tight up to the constant in the exponent. Our lower bound goes via proving an exponential lower bound on the discrepancy of set systems of subcubes of the Boolean cube $\{0, 1\}^d$. Comment: To appear in the Proceedings of the American Mathematical Society.
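
    As an illustration of the quantity being bounded (not part of the paper): the discrepancy of this set system is the minimum, over $\pm 1$ colorings of $\{1, \dots, n\}$, of the maximum absolute color sum over the progressions $\{d, 2d, 3d, \dots\} \cap [n]$, and the hereditary version takes a further maximum over restrictions to subsets of $[n]$. The brute-force sketch below computes the plain (non-hereditary) discrepancy for tiny $n$ only; the function name is made up for illustration.

```python
from itertools import product

def hap_discrepancy(n):
    """Discrepancy of homogeneous arithmetic progressions inside [1, n],
    by exhaustive search over +/-1 colorings (feasible only for tiny n)."""
    # Each HAP is {d, 2d, 3d, ...} intersected with {1, ..., n}.
    haps = [list(range(d, n + 1, d)) for d in range(1, n + 1)]
    best = float("inf")
    for signs in product((-1, 1), repeat=n):  # coloring of 1..n
        worst = max(abs(sum(signs[i - 1] for i in hap)) for hap in haps)
        best = min(best, worst)
    return best

if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, hap_discrepancy(n))
```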

    Approximating Hereditary Discrepancy via Small Width Ellipsoids

    The discrepancy of a hypergraph is the minimum attainable value, over two-colorings of its vertices, of the maximum absolute imbalance of any hyperedge. The hereditary discrepancy of a hypergraph, defined as the maximum discrepancy of a restriction of the hypergraph to a subset of its vertices, is a measure of its complexity. Lovasz, Spencer and Vesztergombi (1986) related the natural extension of this quantity to matrices to rounding algorithms for linear programs, and gave a determinant-based lower bound on the hereditary discrepancy. Matousek (2011) showed that this bound is tight up to a polylogarithmic factor, leaving open the question of actually computing this bound. Recent work by Nikolov, Talwar and Zhang (2013) showed a polynomial-time $\tilde{O}(\log^3 n)$-approximation to hereditary discrepancy, as a by-product of their work in differential privacy. In this paper, we give a direct and simple $O(\log^{3/2} n)$-approximation algorithm for this problem. We show that up to this approximation factor, the hereditary discrepancy of a matrix $A$ is characterized by the optimal value of a simple geometric convex program that seeks to minimize the largest $\ell_{\infty}$ norm of any point in an ellipsoid containing the columns of $A$. This characterization promises to be a useful tool in discrepancy theory.
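
    One way to write the geometric program described above is as a small semidefinite program: find a positive semidefinite matrix $F$, representing the ellipsoid $E = \{x : x^{\top} F^{-1} x \le 1\}$, that contains every column of $A$ while minimizing $\max_i F_{ii}$, whose square root is the largest $\ell_\infty$ norm of a point in $E$. The sketch below is a hedged rendering of that formulation in cvxpy; the function name and the reliance on an off-the-shelf SDP solver are assumptions, not the paper's implementation.

```python
import cvxpy as cp
import numpy as np

def ellipsoid_linf_bound(A):
    """Sketch: minimize the largest ell_infinity norm of a point in an
    ellipsoid E = {x : x^T F^{-1} x <= 1} containing all columns of A.
    Returns max_i sqrt(F_ii) at the optimum."""
    m, n = A.shape
    F = cp.Variable((m, m), PSD=True)   # describes the ellipsoid
    t = cp.Variable()
    constraints = [cp.diag(F) <= t]     # t upper-bounds every F_ii
    for j in range(n):
        a = A[:, j].reshape(-1, 1)
        # Schur complement: a^T F^{-1} a <= 1  <=>  [[F, a], [a^T, 1]] is PSD
        constraints.append(cp.bmat([[F, a], [a.T, np.ones((1, 1))]]) >> 0)
    cp.Problem(cp.Minimize(t), constraints).solve()
    return float(np.sqrt(t.value))

if __name__ == "__main__":
    # Example: the ellipsoid bound for a small random 0/1 set system.
    rng = np.random.default_rng(0)
    A = rng.integers(0, 2, size=(6, 10)).astype(float)
    print(ellipsoid_linf_bound(A))
```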

    Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations

    Consider a database of $n$ people, each represented by a bit-string of length $d$ corresponding to the setting of $d$ binary attributes. A $k$-way marginal query is specified by a subset $S$ of $k$ attributes, and a $|S|$-dimensional binary vector $\beta$ specifying their values. The result for this query is a count of the number of people in the database whose attribute vector restricted to $S$ agrees with $\beta$. Privately releasing approximate answers to a set of $k$-way marginal queries is one of the most important and well-motivated problems in differential privacy. Information theoretically, the error complexity of marginal queries is well-understood: the per-query additive error is known to be at least $\Omega(\min\{\sqrt{n}, d^{\frac{k}{2}}\})$ and at most $\tilde{O}(\min\{\sqrt{n}\, d^{1/4}, d^{\frac{k}{2}}\})$. However, no polynomial time algorithm with error complexity as low as the information theoretic upper bound is known for small $n$. In this work we present a polynomial time algorithm that, for any distribution on marginal queries, achieves average error at most $\tilde{O}(\sqrt{n}\, d^{\frac{\lceil k/2 \rceil}{4}})$. This error bound is as good as the best known information theoretic upper bounds for $k = 2$. This bound is an improvement over previous work on efficiently releasing marginals when $k$ is small and when error $o(n)$ is desirable. Using private boosting we are also able to give nearly matching worst-case error bounds. Our algorithms are based on the geometric techniques of Nikolov, Talwar, and Zhang. The main new ingredients are convex relaxations and careful use of the Frank-Wolfe algorithm for constrained convex minimization. To design our relaxations, we rely on the Grothendieck inequality from functional analysis.
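
    For readers unfamiliar with the last ingredient: Frank-Wolfe minimizes a smooth convex function over a convex feasible set using only a linear minimization oracle over that set, which is what makes it usable over convex relaxations like those mentioned above. The following is a generic, hedged sketch and not the paper's relaxation; the oracle and objective passed in are placeholders.

```python
import numpy as np

def frank_wolfe(grad, lin_oracle, x0, num_iters=200):
    """Generic Frank-Wolfe: minimize a smooth convex f over a convex set C.

    grad(x)       -- gradient of f at x
    lin_oracle(g) -- returns argmin_{s in C} <g, s>  (a point of C)
    x0            -- feasible starting point
    """
    x = np.asarray(x0, dtype=float)
    for t in range(num_iters):
        s = lin_oracle(grad(x))            # best point w.r.t. the linearization
        gamma = 2.0 / (t + 2.0)            # standard step-size schedule
        x = (1.0 - gamma) * x + gamma * s  # stays inside C by convexity
    return x

if __name__ == "__main__":
    # Toy usage: minimize ||x - b||^2 over the hypercube [-1, 1]^3.
    b = np.array([0.3, -2.0, 0.7])
    grad = lambda x: 2.0 * (x - b)
    lin_oracle = lambda g: -np.sign(g)     # minimizer of <g, s> over [-1, 1]^3
    print(frank_wolfe(grad, lin_oracle, np.zeros(3)))
```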

    The Geometry of Differential Privacy: the Sparse and Approximate Cases

    In this work, we study trade-offs between accuracy and privacy in the context of linear queries over histograms. This is a rich class of queries that includes contingency tables and range queries, and has been a focus of a long line of work. For a set of $d$ linear queries over a database $x \in \mathbb{R}^N$, we seek to find the differentially private mechanism that has the minimum mean squared error. For pure differential privacy, an $O(\log^2 d)$ approximation to the optimal mechanism is known. Our first contribution is to give an $O(\log^2 d)$ approximation guarantee for the case of $(\epsilon, \delta)$-differential privacy. Our mechanism is simple, efficient and adds correlated Gaussian noise to the answers. We prove its approximation guarantee relative to the hereditary discrepancy lower bound of Muthukrishnan and Nikolov, using tools from convex geometry. We next consider this question in the case when the number of queries exceeds the number of individuals in the database, i.e. when $d > n \triangleq \|x\|_1$. It is known that better mechanisms exist in this setting. Our second main contribution is to give an $(\epsilon, \delta)$-differentially private mechanism which is optimal up to a $\mathrm{polylog}(d, N)$ factor for any given query set $A$ and any given upper bound $n$ on $\|x\|_1$. This approximation is achieved by coupling the Gaussian noise addition approach with a linear regression step. We give an analogous result for the $\epsilon$-differential privacy setting. We also improve on the mean squared error upper bound for answering counting queries on a database of size $n$ by Blum, Ligett, and Roth, and match the lower bound implied by the work of Dinur and Nissim up to logarithmic factors. The connection between hereditary discrepancy and the privacy mechanism enables us to derive the first polylogarithmic approximation to the hereditary discrepancy of a matrix $A$.
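
    The "Gaussian noise plus regression" idea can be illustrated with a small sketch: answer the linear queries with Gaussian noise, then post-process by regressing the noisy answers back onto answers consistent with some histogram, which cannot hurt privacy. This is only a schematic, assumed example: the noise below is uncorrelated rather than the paper's correlated noise, the regression is unconstrained least squares, and the privacy calibration of the noise scale is omitted.

```python
import numpy as np

def noisy_answers_with_regression(A, x, sigma, rng=None):
    """Sketch: answer linear queries Ax with Gaussian noise, then project
    the noisy vector back via least squares over candidate histograms.
    sigma must be calibrated to the sensitivity of A and to (eps, delta)
    for differential privacy; that calibration is omitted here."""
    rng = np.random.default_rng() if rng is None else rng
    d, N = A.shape
    noisy = A @ x + rng.normal(scale=sigma, size=d)     # Gaussian mechanism
    # Regression step: find a (not necessarily integral) histogram x_hat
    # whose answers best fit the noisy ones, then re-answer the queries.
    x_hat, *_ = np.linalg.lstsq(A, noisy, rcond=None)
    return A @ x_hat

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.integers(0, 2, size=(50, 20)).astype(float)  # 50 counting queries
    x = rng.integers(0, 5, size=20).astype(float)        # toy histogram
    err = noisy_answers_with_regression(A, x, sigma=5.0, rng=rng) - A @ x
    print(np.max(np.abs(err)))
```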

    Towards a Constructive Version of Banaszczyk's Vector Balancing Theorem

    An important theorem of Banaszczyk (Random Structures & Algorithms '98) states that for any sequence of vectors of $\ell_2$ norm at most $1/5$ and any convex body $K$ of Gaussian measure $1/2$ in $\mathbb{R}^n$, there exists a signed combination of these vectors which lands inside $K$. A major open problem is to devise a constructive version of Banaszczyk's vector balancing theorem, i.e. to find an efficient algorithm which constructs the signed combination. We make progress towards this goal along several fronts. As our first contribution, we show an equivalence between Banaszczyk's theorem and the existence of $O(1)$-subgaussian distributions over signed combinations. For the case of symmetric convex bodies, our equivalence implies the existence of a universal signing algorithm (i.e. independent of the body), which simply samples from the subgaussian sign distribution and checks to see if the associated combination lands inside the body. For asymmetric convex bodies, we provide a novel recentering procedure, which allows us to reduce to the case where the body is symmetric. As our second main contribution, we show that the above framework can be efficiently implemented when the vectors have length $O(1/\sqrt{\log n})$, recovering Banaszczyk's results under this stronger assumption. More precisely, we use random walk techniques to produce the required $O(1)$-subgaussian signing distributions when the vectors have length $O(1/\sqrt{\log n})$, and use a stochastic gradient ascent method to implement the recentering procedure for asymmetric bodies.
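
    The universal signing algorithm for symmetric bodies described above is, at a high level, rejection sampling: draw a sign vector from the subgaussian distribution and test whether the signed combination lies in $K$. The sketch below only illustrates that control flow under stated simplifications: uniformly random signs stand in for the paper's $O(1)$-subgaussian distribution, and a Euclidean ball stands in for the body $K$.

```python
import numpy as np

def universal_signing(vectors, in_body, max_tries=1000, rng=None):
    """Sketch of the sampling-based signing loop: repeatedly draw a sign
    vector and accept it if the signed combination lands in the body.
    Uniform random signs stand in for the O(1)-subgaussian distribution."""
    rng = np.random.default_rng() if rng is None else rng
    V = np.asarray(vectors, dtype=float)           # rows are the input vectors
    for _ in range(max_tries):
        signs = rng.choice([-1.0, 1.0], size=len(V))
        combo = signs @ V                          # signed combination
        if in_body(combo):
            return signs, combo
    raise RuntimeError("no accepted signing within max_tries")

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n = 100
    vectors = rng.normal(size=(n, n))
    vectors /= 5.0 * np.linalg.norm(vectors, axis=1, keepdims=True)  # norm 1/5
    in_ball = lambda y: np.linalg.norm(y) <= 2.0 * np.sqrt(n) / 5.0  # symmetric body
    print(universal_signing(vectors, in_ball, rng=rng)[1][:3])
```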